
    J-MOD^{2}: Joint Monocular Obstacle Detection and Depth Estimation

    In this work, we propose an end-to-end deep architecture that jointly learns to detect obstacles and estimate their depth for MAV flight applications. Most existing approaches rely either on Visual SLAM systems or on depth estimation models to build 3D maps and detect obstacles. However, for the task of avoiding obstacles this level of complexity is not required. Recent works have proposed multi-task architectures that perform both scene understanding and depth estimation. We follow their track and propose a specific architecture to jointly estimate depth and obstacles, without the need to compute a global map, while maintaining compatibility with a global SLAM system if needed. The network architecture is devised to exploit the joint information of the obstacle detection task, which produces more reliable bounding boxes, and the depth estimation task, increasing the robustness of both to scenario changes. We call this architecture J-MOD^{2}. We test the effectiveness of our approach with experiments on sequences with different appearance and focal lengths, and compare it to state-of-the-art multi-task methods that jointly perform semantic segmentation and depth estimation. In addition, we show the integration in a full system using a set of simulated navigation experiments where a MAV explores an unknown scenario and plans safe trajectories by using our detection model.
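The abstract does not give the training objective; as a hypothetical sketch only, a joint multi-task loss of the kind described (depth regression combined with obstacle classification, with a weighting factor between the two tasks) could look like this. The function name, the L2/cross-entropy choice, and the weight `w_det` are illustrative assumptions, not the paper's actual loss.

```python
import numpy as np

def joint_loss(depth_pred, depth_gt, det_pred, det_gt, w_det=1.0):
    """Illustrative multi-task objective (not the paper's exact loss):
    L2 depth regression plus binary cross-entropy obstacle
    classification, combined with a task weight w_det."""
    depth_term = np.mean((depth_pred - depth_gt) ** 2)
    eps = 1e-7  # clip to avoid log(0)
    p = np.clip(det_pred, eps, 1 - eps)
    det_term = -np.mean(det_gt * np.log(p) + (1 - det_gt) * np.log(1 - p))
    return depth_term + w_det * det_term
```

Sharing a backbone while summing such per-task terms is the standard way joint architectures let each task regularize the other.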

    Perception-aware Path Planning

    In this paper, we give a double twist to the problem of planning under uncertainty. State-of-the-art planners seek to minimize the localization uncertainty by only considering the geometric structure of the scene. In this paper, we argue that motion planning for vision-controlled robots should be perception aware, in that the robot should also favor texture-rich areas to minimize the localization uncertainty during a goal-reaching task. Thus, we describe how to optimally incorporate the photometric information (i.e., texture) of the scene, in addition to the geometric one, to compute the uncertainty of vision-based localization during path planning. To avoid the caveats of feature-based localization systems (i.e., dependence on feature type and user-defined thresholds), we use dense, direct methods. This allows us to compute the localization uncertainty directly from the intensity values of every pixel in the image. We also describe how to compute trajectories online, also considering scenarios with no prior knowledge about the map. The proposed framework is general and can easily be adapted to different robotic platforms and scenarios. The effectiveness of our approach is demonstrated with extensive experiments in both simulated and real-world environments using a vision-controlled micro aerial vehicle. Comment: 16 pages, 20 figures, revised version. Conditionally accepted for IEEE Transactions on Robotics.
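The abstract's key idea is that photometric information (texture) carries localization information for dense, direct methods. As a minimal sketch of that intuition (the paper's actual uncertainty propagation is not reproduced here), a crude per-view texture score can be computed from squared intensity gradients, with flat, textureless views scoring zero:

```python
import numpy as np

def texture_score(img):
    """Sum of squared intensity gradients over the image: a crude
    proxy for how much photometric information a dense, direct method
    could exploit for localization. Flat regions contribute nothing."""
    gy, gx = np.gradient(img.astype(float))
    return float(np.sum(gx ** 2 + gy ** 2))
```

A perception-aware planner in this spirit would prefer candidate viewpoints with higher scores, all else (path length, geometry) being equal.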

    Transferring knowledge across robots: A risk sensitive approach

    One of the most impressive characteristics of human perception is its domain adaptation capability. Humans can recognize objects and places simply by transferring knowledge from their past experience. Inspired by that, current research in robotics is addressing a great challenge: building robots able to sense and interpret the surrounding world by reusing information previously collected, gathered by other robots or obtained from the web. But how can a robot automatically understand what is useful among a large amount of information and perform knowledge transfer? In this paper we address the domain adaptation problem in the context of visual place recognition. We consider the scenario where a robot equipped with a monocular camera explores a new environment. In this situation, traditional approaches based on supervised learning perform poorly, as no annotated data are provided in the new environment and the models learned from data collected in other places are inappropriate due to the large variability of visual information. To overcome these problems we introduce a novel transfer learning approach. With our algorithm the robot is given only some training data (annotated images collected in different environments by other robots) and is able to decide whether, and how much, this knowledge is useful in the current scenario. At the core of our approach is a transfer risk measure which quantifies the similarity between the given and the new visual data. To improve the performance, we also extend our framework to take into account multiple visual cues. Our experiments on three publicly available datasets demonstrate the effectiveness of the proposed approach.
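The paper's transfer risk measure quantifies how similar the source data is to the new visual data. As a purely hypothetical stand-in (the actual measure is not given in the abstract), a risk of this flavor can be sketched as one minus the histogram intersection of two feature distributions: near 0 when source and target look alike, near 1 when they do not.

```python
import numpy as np

def transfer_risk(source_feats, target_feats, bins=16):
    """Hypothetical transfer-risk proxy (not the paper's measure):
    1 minus the histogram intersection of two 1-D feature
    distributions, computed over their joint range."""
    lo = min(source_feats.min(), target_feats.min())
    hi = max(source_feats.max(), target_feats.max())
    hs, _ = np.histogram(source_feats, bins=bins, range=(lo, hi))
    ht, _ = np.histogram(target_feats, bins=bins, range=(lo, hi))
    hs = hs / hs.sum()  # normalize to probability mass
    ht = ht / ht.sum()
    return float(1.0 - np.minimum(hs, ht).sum())
```

A robot using such a score would transfer aggressively when the risk is low and fall back on cautious, self-supervised adaptation when it is high.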

    A Minimum Energy solution to Monocular Simultaneous Localization and Mapping

    In this paper we propose an alternative solution to the Monocular Simultaneous Localization and Mapping (SLAM) problem. This approach uses a Minimum-Energy Observer for Systems with Perspective Outputs and provides an optimal solution. In contrast to the well-known EKF-SLAM algorithm, this method yields a global solution and no linearization procedures are required. Furthermore, we show that the estimation error converges exponentially fast toward a neighborhood of zero, where this region grows gracefully with the magnitude of the input disturbance, output noise, and initial camera position uncertainty. For practical purposes, we also present the filter in both continuous- and discrete-time form. Moreover, to show how to integrate a new landmark into the state estimation, a simple initialization procedure is presented. The filter's performance is illustrated via simulations.

    Robust stabilization for SISO linear systems via output feedback

    The authors consider the problem of robust stabilization, via output feedback, of single-input single-output (SISO) linear systems. A dynamic output feedback controller is proposed under the assumptions that the relative degree is known, the system is minimum phase, and the sign of the high-frequency gain and a bound on its modulus are known. The proposed procedure is widely applicable because the dynamics of the system can be completely unknown; e.g., the number of poles, stable and/or unstable, can be unknown and arbitrarily large, and the high-frequency gain is bounded but can be unknown.

    Use of observers for the inversion of nonlinear maps: An application to the inverse kinematic problem

    The authors define, in a general framework, several problems related to inverse kinematics already present in the literature. Six problems are introduced: given an extended reference trajectory, the first three problems consist in finding, both in an exact and in an approximate way, an extended inverse reference trajectory. The last three problems consist in finding, both in an exact and in an approximate way, an extended inverse reference trajectory without knowledge of the time derivatives of the reference trajectory. It is then shown how a solution to the above problems can be obtained by constructing an observer for a certain time-varying nonlinear system, which can be associated with the direct kinematics and the given reference trajectory, and by proposing some observer structures for such a nonlinear system. All the definitions and the theory reported are applied to a simple example: a two-link robot arm.
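The observer-based view of inverse kinematics can be illustrated on the paper's own example, a planar two-link arm. The sketch below is not the authors' observer structure: it is a generic Jacobian-transpose iteration, assumed here only to show the idea of driving a joint-state estimate until the direct kinematics match the reference. Link lengths, gains, and step counts are illustrative.

```python
import numpy as np

def fk(q, l1=1.0, l2=1.0):
    """Direct kinematics of a planar two-link arm: joint angles
    q = (q1, q2) to end-effector position (x, y)."""
    x = l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1])
    y = l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q, l1=1.0, l2=1.0):
    """Analytic Jacobian of fk with respect to q."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def ik_observer(target, q0, gain=2.0, dt=0.01, steps=5000):
    """Observer-like iteration: correct the joint estimate q with the
    Jacobian transpose of the task-space error until fk(q) tracks the
    reference point. Converges for reachable, non-singular targets."""
    q = np.array(q0, dtype=float)
    for _ in range(steps):
        err = target - fk(q)
        q += dt * gain * jacobian(q).T @ err
    return q
```

The transpose (rather than the inverse) of the Jacobian keeps the correction well behaved near singular configurations, at the cost of slower convergence.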

    Ball Detection and Predictive Ball Following Based on a Stereoscopic Vision System

    In this paper we describe an efficient software architecture for object tracking, based on a stereoscopic vision system, that has been applied to a mobile robot controlled by a PC. After analyzing the epipolar rectification required to correct the original stereo images, we describe a new, efficient algorithm for ball recognition (in fact, circle detection) which is able to work in different lighting conditions and runs faster than some modified versions of the Circle Hough Transform. Then, we show that stereo vision, besides giving an optimal estimate of the 3D position of the object, is useful for removing many of the false identifications of the ball, thanks to the epipolar constraint. Finally, we describe a new strategy for ball following by a mobile robot, which is able to "look for" the object whenever it leaves the cameras' view, by taking advantage of a "block matching" method similar to that of MPEG video. Index Terms: ball detection, ball tracking, following, predictive, stereoscopic vision.
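The "block matching" idea borrowed from MPEG video coding can be sketched concretely: exhaustively search a small window for the displacement that minimizes the sum of absolute differences (SAD) between a reference block and the next frame. This is a generic textbook version, assumed here for illustration; the paper's predictive variant and search window are not specified in the abstract.

```python
import numpy as np

def block_match(prev, curr, top, left, size=8, radius=4):
    """Exhaustive block matching in the spirit of MPEG motion
    estimation: find the (dy, dx) displacement, within +/- radius,
    that minimizes the SAD between the block at (top, left) in `prev`
    and the corresponding block in `curr`."""
    block = prev[top:top + size, left:left + size].astype(int)
    best_sad, best_dxdy = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            t, l = top + dy, left + dx
            # skip candidate blocks that fall outside the frame
            if t < 0 or l < 0 or t + size > curr.shape[0] or l + size > curr.shape[1]:
                continue
            cand = curr[t:t + size, l:l + size].astype(int)
            sad = int(np.abs(block - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_dxdy = sad, (dy, dx)
    return best_dxdy
```

Applied to the ball-following task, the estimated displacement of the last known ball block gives the robot a predicted direction in which to "look for" the ball after it leaves the field of view.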

    Towards domain independence for learning-based monocular depth estimation

    Modern autonomous mobile robots require a strong understanding of their surroundings in order to safely operate in cluttered and dynamic environments. Monocular depth estimation offers a geometry-independent paradigm to detect free, navigable space with minimal space and power consumption. These are highly desirable features, especially for micro aerial vehicles. In order to guarantee robust operation in real-world scenarios, the estimator is required to generalize well in diverse environments. Most existing depth estimators do not consider generalization, and only benchmark their performance on publicly available datasets after specific fine-tuning. Generalization can be achieved by training on several heterogeneous datasets, but their collection and labeling is costly. In this letter, we propose a deep neural network for scene depth estimation that is trained on synthetic datasets, which allow inexpensive generation of ground truth data. We show how this approach is able to generalize well across different scenarios. In addition, we show how the addition of long short-term memory layers in the network helps to alleviate, in sequential image streams, some of the intrinsic limitations of monocular vision, such as global scale estimation, with low computational overhead. We demonstrate that the network is able to generalize well with respect to different real-world environments without any fine-tuning, achieving comparable performance to state-of-the-art methods on the KITTI dataset.